{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "name": "MNIST_ANN.ipynb",
      "provenance": [],
      "authorship_tag": "ABX9TyO5sZO//HyppgRQt7CklRJ/",
      "include_colab_link": true
    },
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3"
    }
  },
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "view-in-github",
        "colab_type": "text"
      },
      "source": [
        "<a href=\"https://colab.research.google.com/github/Divyanshu-ISM/Oil-and-Gas-data-analysis/blob/master/MNIST_ANN.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "hN0pfEdsf3OC",
        "colab_type": "text"
      },
      "source": [
        "#Image Classification with Keras and Tensorflow (ANNs)\n",
        "\n",
        "MNIST Dataset"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "X-b1NNLhnaeF",
        "colab_type": "text"
      },
      "source": [
        "#Task 1: Import Libraries."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "i4PcZi39ffrL",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "import tensorflow as tf\n",
        "import numpy as np\n",
        "import matplotlib.pyplot as plt\n",
        "import seaborn as sns\n",
        "import pandas as pd"
      ],
      "execution_count": 6,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "VIJV7AGonVyW",
        "colab_type": "text"
      },
      "source": [
        "#Task 2 : Import MNIST dataset."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "zrBLTv6_leN9",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "from tensorflow.keras.datasets import mnist"
      ],
      "execution_count": 3,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "E5dFCUQ1m42o",
        "colab_type": "text"
      },
      "source": [
        "mnist.load_data() returns training and testing data."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "l2b9WKRBmvBz",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 54
        },
        "outputId": "2c2a0ab7-24ba-45bc-9aff-421387157982"
      },
      "source": [
        "(x_train, y_train), (x_test, y_test) = mnist.load_data()"
      ],
      "execution_count": 4,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz\n",
            "11493376/11490434 [==============================] - 0s 0us/step\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "qgmu6vJpoY0L",
        "colab_type": "text"
      },
      "source": [
        "We use the training data to fit and train the NN.\n",
        "\n",
        "We use the test data to test/validate the performance of our NN."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "0tq-aidIv79P",
        "colab_type": "text"
      },
      "source": [
        "##Shapes of all the datasets imported."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "zRdOPMXhnQwv",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 90
        },
        "outputId": "26b29c62-5db3-451c-8a8b-4b4e61401588"
      },
      "source": [
        "print(f'x_train :{x_train.shape}')\n",
        "print(f'y_train :{y_train.shape}')\n",
        "\n",
        "print(f'x_test :{x_test.shape}')\n",
        "print(f'y_test :{y_test.shape}')"
      ],
      "execution_count": 7,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "x_train :(60000, 28, 28)\n",
            "y_train :(60000,)\n",
            "x_test :(10000, 28, 28)\n",
            "y_test :(10000,)\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "HzOPyiT2wSBj",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "#So in (60000, 28 , 28)\n",
        "\n",
        "#We have 60000 image examples - ie, 60000 Images\n",
        "#Each Image has 28 Rows and 28 Columns - Each image is 28x28 pixel sized."
      ],
      "execution_count": 8,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "qf3p7q8ExI2z",
        "colab_type": "text"
      },
      "source": [
        "Let's try and display one example."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "NDS15sGewuH4",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 283
        },
        "outputId": "04d8cfd5-72a4-4ae5-c802-82f2e7b924ae"
      },
      "source": [
        "plt.imshow(x_train[3], cmap='binary')"
      ],
      "execution_count": 12,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "<matplotlib.image.AxesImage at 0x7f4f3bd195f8>"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 12
        },
        {
          "output_type": "display_data",
          "data": {
            "image/png": "iVBORw0KGgoAAAANSUhEUgAAAPsAAAD4CAYAAAAq5pAIAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAAMl0lEQVR4nO3db6hc9Z3H8c9n77Y+MEVjM1yjDaYWMchC0zLExWrNKhvUB8b6QJoHNYo0BaOkUGSDK9YHPojL2lJhKaSbkHTpWgqtGkS0MdQ/eVK8StZEZVdXbmhiTOaiEvvErrfffXBPym28c+7NnHPmzM33/YJhZs535vy+nNxPzsw5M/NzRAjA2e9v2m4AwHAQdiAJwg4kQdiBJAg7kMTfDnOwZcuWxcqVK4c5JJDK5OSkpqamPFetUtht3yDpJ5LGJP17RGwre/zKlSs1MTFRZUgAJbrdbt/awC/jbY9J+jdJN0q6QtIG21cMuj4Azarynn2NpHci4t2I+JOkX0paX09bAOpWJewXS/rDrPtHimV/xfYm2xO2J3q9XoXhAFTR+NH4iNgeEd2I6HY6naaHA9BHlbAflbRi1v0vFcsAjKAqYX9F0mW2v2z785K+LWlPPW0BqNvAp94i4lPb90h6TjOn3nZGxBu1dQagVpXOs0fEM5KeqakXAA3i47JAEoQdSIKwA0kQdiAJwg4kQdiBJAg7kARhB5Ig7EAShB1IgrADSRB2IAnCDiRB2IEkCDuQBGEHkiDsQBKEHUiCsANJEHYgCcIOJEHYgSQIO5AEYQeSIOxAEoQdSIKwA0kQdiAJwg4kUWkWV6BJDz/8cGn9wQcfLK1HRN/aCy+8UPrca6+9trS+GFUKu+1JSR9Lmpb0aUR062gKQP3q2LP/Q0RM1bAeAA3iPTuQRNWwh6Tf2n7V9qa5HmB7k+0J2xO9Xq/icAAGVTXsV0fE1yXdKGmz7W+e/oCI2B4R3YjodjqdisMBGFSlsEfE0eL6hKQnJK2poykA9Rs47LbPtf2FU7clrZN0qK7GANSrytH4cUlP2D61nv+MiGdr6Qop7Nq1q7S+bdu20vrY2FhpfXp6um+t+LtNZeCwR8S7kr5aYy8AGsSpNyAJwg4kQdiBJAg7kARhB5LgK65ozeHDh0vrn3zyyZA6yYE9O5AEYQeSIOxAEoQdSIKwA0kQdiAJwg4kwXl2NOr555/vW3vssccqrXvVqlWl9aeffrpvbXx8vNLYixF7diAJwg4kQdiBJAg7kARhB5Ig7EAShB1IgvPsqGT//v2l9TvuuKNv7eTJk5XGvu+++0rrl1xySaX1n23YswNJEHYgCcIOJEHYgSQIO5AEYQeSIOxAEpxnRyW7d+8urb/33nsDr3vt2rWl9dtvv33gdWc0757d9k7bJ2wfmrXsAtt7bb9dXC9ttk0AVS3kZfwuSTectmyrpH0RcZmkfcV9ACNs3rBHxEuSPjht8XpJp16/7ZZ0S819AajZoAfoxiPiWHH7fUl9f9DL9ibbE7Yner3egMMBqKry0fiICElRUt8eEd2I6HY6narDARjQoGE/bnu5JBXXJ+prCUATBg37Hkkbi9sbJT1VTzsAmjLveXbbj0taK2mZ7SOSfihpm6Rf2b5L0mFJtzXZJNozNTVVWt+xY0dpfWxsrG/t/PPPL33uAw88UFrHmZk37BGxoU/p+pp7AdAgPi4LJEHYgSQIO5AEYQeSIOxAEnzFNbnJycnS+q233trY2Pfee29p/brrrmts7IzYswNJEHYgCcIOJEHYgSQIO5AEYQeSIOxAEpxnT+7ZZ58trR88eLDS+q+/vv+XI7ds2VJp3Tgz7NmBJAg7kARhB5Ig7EAShB1IgrADSRB2IAnOs5/lnnzyydL61q3V5uS85pprSutlUzqfd955lcbGmWHPDiRB2IEkCDuQBGEHkiDsQBKEHUiCsANJcJ79LFD22+9N/u67JF166aWl9fHx8UbHx8LNu2e3vdP2CduHZi17yPZR2weKy03NtgmgqoW8jN8l6YY5lv84IlYXl2fqbQtA3eYNe0S8JOmDIfQCoEFVDtDdY/v14mX+0n4Psr3J9oTtiV6vV2E4AFUMGvafSvqKpNWSjkl6tN8DI2J7RHQjotvpdAYcDkBVA4U9Io5HxHRE/FnSzyStqbctAHUbKOy2l8+6+y1Jh/o9FsBomPc8u+3HJa2VtMz2EUk/lLTW9mpJIWlS0vca7BHzeOSRR/rWxsbGGh276vfhMTzzhj0iNsyxeEcDvQBoEB+XBZIg7EAShB1IgrADSRB2IAm+4roIHDhwoLT+3HPPNTb2zTffXFq//PLLGxsb9WLPDiRB2IEkCDuQBGEHkiDsQBKEHUiCsANJcJ59EVi3bl1p/cMPPxx43VdeeWVpvWzKZSwu7NmBJAg7kARhB5Ig7EAShB1IgrADSRB2IAnOsy8CU1NTpfUqPxe9efPm0vqSJUsGXjdGC3t2IAnCDiRB2IEkCDuQBGEHkiDsQBKEHUiC8+wj4M477yytR0RpfXp6euCxr7rqqoGfi8Vl3j277RW2f2f7Tdtv2N5SLL/A9l7bbxfXS5tvF8CgFvIy/lNJP4iIKyT9vaTNtq+QtFXSvoi4TNK+4j6AETVv2CPiWES8Vtz+WNJbki6WtF7Sqd8s2i3plqaaBFDdGR2gs71S0tck/V7SeEQcK0rvSxrv85xNtidsT/R6vQqtAqhiwWG3vUTSryV9PyJOzq7FzBGkOY8iRcT2iOhGRLfT6VRqFsDgFhR225/TTNB/ERG/KRYft728qC+XdKKZFgHUYd5Tb7YtaYektyLiR7NKeyRtlLStuH6qkQ7PAvNNubx3797S+sw/QX/nnHNO39rdd99d+tzx8TnffeEstJDz7N+Q9B1JB22f+qu9XzMh/5XtuyQdlnRbMy0CqMO8YY+I/ZL67Vqur7cdAE3h47JAEoQdSIKwA0kQdiAJwg4kwVdch+Cjjz4qrR8/frzS+i+66KK+tUcffbTSunH2YM8OJEHYgSQIO5AEYQeSIOxAEoQdSIKwA0kQdiAJwg4kQdiBJAg7kARhB5Ig7EAShB1IgrADSfB99iFYtWpVaX2+aZNffvnlOttBUuzZgSQIO5AEYQeSIOxAEoQdSIKwA0kQdiCJhczPvkLSzyWNSwpJ2yPiJ7YfkvRdSb3iofdHxDNNNbqYXXjhhaX1F198cUidILOFfKjmU0k/iIjXbH9B0qu29xa1H0fEvzbXHoC6LGR+9mOSjhW3P7b9lqSLm24MQL3O6D277ZWSvibp98Wie2y/bnun7aV9nrPJ9oTtiV6vN9dDAAzBgsNue4mkX0v6fkSclPRTSV+RtFoze/45JxWLiO0R0Y2IbqfTqaFlAINYUNhtf04zQf9FRPxGkiLieERMR8SfJf1M0prm2gRQ1bxht21JOyS9FRE/mrV8+ayHfUvSofrbA1CXhRyN/4ak70g6aPtAsex+SRtsr9bM6bhJSd9rpEMAtVjI0fj9kjxHiXPqwCLCJ+iAJAg7kARhB5Ig7EAShB1IgrADSRB2IAnCDiRB2IEkCDuQBGEHkiDsQBKEHUiCsANJOCKGN5jdk3R41qJlkqaG1sCZGdXeRrUvid4GVWdvl0TEnL//NtSwf2ZweyIiuq01UGJUexvVviR6G9SweuNlPJAEYQeSaDvs21sev8yo9jaqfUn0Nqih9Nbqe3YAw9P2nh3AkBB2IIlWwm77Btv/bfsd21vb6KEf25O2D9o+YHui5V522j5h+9CsZRfY3mv77eJ6zjn2WurtIdtHi213wPZNLfW2wvbvbL9p+w3bW4rlrW67kr6Gst2G/p7d9pik/5H0j5KOSHpF0oaIeHOojfRhe1JSNyJa/wCG7W9K+qOkn0fE3xXL/kXSBxGxrfiPcmlE/NOI9PaQpD+2PY13MVvR8tnTjEu6RdIdanHblfR1m4aw3drYs6+R9E5EvBsRf5L0S0nrW+hj5EXES5I+OG3xekm7i9u7NfPHMnR9ehsJEXEsIl4rbn8s6dQ0461uu5K+hqKNsF8s6Q+z7h/RaM33HpJ+a/tV25vabmYO4xFxrLj9vqTxNpuZw7zTeA/TadOMj8y2G2T686o4QPdZV0fE1yXdKGlz8XJ1JMXMe7BROne6oGm8h2WOacb/os1tN+j051W1EfajklbMuv+lYtlIiIijxfUJSU9o9KaiPn5qBt3i+kTL/fzFKE3jPdc04xqBbdfm9OdthP0VSZfZ/rLtz0v6tqQ9LfTxGbbPLQ6cyPa5ktZp9Kai3iNpY3F7o6SnWuzlr4zKNN79phlXy9uu9enPI2LoF0k3aeaI/P9K+uc2eujT16WS/qu4vNF2b5Ie18zLuv/TzLGNuyR9UdI+SW9Lel7SBSPU239IOijpdc0Ea3lLvV2tmZfor0s6UFxuanvblfQ1lO3Gx2WBJDhAByRB2IEkCDuQBGEHkiDsQBKEHUiCsANJ/D8K28WFOQm56wAAAABJRU5ErkJggg==\n",
            "text/plain": [
              "<Figure size 432x288 with 1 Axes>"
            ]
          },
          "metadata": {
            "tags": [],
            "needs_background": "light"
          }
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "GSL9JimMx9lA",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 35
        },
        "outputId": "8bb8cd48-4d15-4948-86aa-3e53264d751b"
      },
      "source": [
        "print(y_train[3])"
      ],
      "execution_count": 13,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "1\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "SLkOUjYhyPri",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 35
        },
        "outputId": "7e7722a7-81e1-469e-9bc3-0b1affc324ac"
      },
      "source": [
        "#Let's see how many unique labels do we have. \n",
        "#Remember? Sets only allow unique values to be stored in em. \n",
        "\n",
        "print(set(y_train))"
      ],
      "execution_count": 15,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "YxX9YMa-xkVx",
        "colab_type": "text"
      },
      "source": [
        "ANNs can identify and classify really well. \n",
        "\n",
        "The only catch is that they need a lot of training data and hence a lot of computational power. \n",
        "\n",
        "That's why you can see that we have taken 60,000 training examples."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "gRv1YXCwxPxk",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "####################################"
      ],
      "execution_count": 16,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "srrxP1KFy-Kl",
        "colab_type": "text"
      },
      "source": [
        "#Task 3: One Hot Encoding"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "0W3jEIWpzWYy",
        "colab_type": "text"
      },
      "source": [
        "n_uniquelabels = 10 (0 --> 9)\n",
        "\n",
        "> In this, every label (every y value) is converted to a list of 10 (n_unuiquelabels) elements. And hence, the element at the index coresponding to y's class will be 1, rest all 0. \n",
        "\n",
        "Examples :- list - [0,0,0,0,0,0,0,0,0,0]\n",
        "\n",
        "> 0 : [1,0,0,0,0,0,0,0,0,0]\n",
        "\n",
        "> 1 : [0,1,0,0,0,0,0,0,0,0]\n",
        "\n",
        "> 9 : [0,0,0,0,0,0,0,0,0,1]"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "kMOG131M0Y9j",
        "colab_type": "text"
      },
      "source": [
        "We can get this done automatically with tensorflow.keras.utils - to_categorical , utility."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "grE2tLqoy9G3",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "from tensorflow.keras.utils import to_categorical\n",
        "\n",
        "y_train_encoded = to_categorical(y_train)\n",
        "\n",
        "y_test_encoded = to_categorical(y_test)"
      ],
      "execution_count": 17,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "WFqglqLp1L5O",
        "colab_type": "text"
      },
      "source": [
        "Just to check if the encoding worked as planned, let's check the shape of the encoded arrays as per the above mentioned concept."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "T1EAh88i1FzQ",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 35
        },
        "outputId": "6c57656a-38ff-4d3d-b39c-5add63c9832e"
      },
      "source": [
        "y_train_encoded.shape"
      ],
      "execution_count": 18,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "(60000, 10)"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 18
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "RB67qViY1Xh_",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 35
        },
        "outputId": "292d4733-30a9-44e1-a420-0356c2c9258b"
      },
      "source": [
        "y_test_encoded.shape"
      ],
      "execution_count": 19,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "(10000, 10)"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 19
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "oQAfE3le1dtQ",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 54
        },
        "outputId": "2b773297-5b9b-42d3-9ce4-05c7dac5e49b"
      },
      "source": [
        "print(y_train[0])\n",
        "\n",
        "print(y_train_encoded[0])"
      ],
      "execution_count": 20,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "5\n",
            "[0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "PG3CIArh2LTv",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "#################################################################"
      ],
      "execution_count": 21,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "7su_0ZaA2aPC",
        "colab_type": "text"
      },
      "source": [
        "#Task 4: Neural Networks."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "6FytkZ9T2ZOT",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "#Basic Understanding of ANNs\n",
        "\n",
        "#Our model will have a 784 --> 128 --> 128 --> 10 structure."
      ],
      "execution_count": 22,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "mCXhByCr9mf9",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "##########################"
      ],
      "execution_count": 23,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "xoLWgUqW9p7R",
        "colab_type": "text"
      },
      "source": [
        "#Task 5 : Pre-Processing the examples."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "hPcMzHHV92Hh",
        "colab_type": "text"
      },
      "source": [
        "We've already converted the output labels y_train to a (60k , 10) shape by OneHotEncoding. \n",
        "\n",
        "Now we gotta convert the 28X28 pixel inputs into a (784,1) shape."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "eoJSY9hV9pCt",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "x_train_reshaped = x_train.reshape(60000,784)\n",
        "\n",
        "x_test_reshaped = x_test.reshape(10000,784)"
      ],
      "execution_count": 25,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "tLCirvqg-VeI",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 35
        },
        "outputId": "916c0d5b-2f80-4eb1-d760-9037b7b6d6e6"
      },
      "source": [
        "x_train_reshaped.shape"
      ],
      "execution_count": 26,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "(60000, 784)"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 26
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "zarSyx8k-n5t",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 35
        },
        "outputId": "6027517a-40ad-45f8-8b98-f6c8322d6d71"
      },
      "source": [
        "x_test_reshaped.shape"
      ],
      "execution_count": 27,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "(10000, 784)"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 27
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "hIt0_ErW_xWB",
        "colab_type": "text"
      },
      "source": [
        "Let's see the unique pixel values of any example."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "IzMaFnUz-qNf",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 55
        },
        "outputId": "f30095a4-9cc3-4aec-92cd-2a639c08f61d"
      },
      "source": [
        "print(set(x_train_reshaped[0]))"
      ],
      "execution_count": 29,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "{0, 1, 2, 3, 9, 11, 14, 16, 18, 23, 24, 25, 26, 27, 30, 35, 36, 39, 43, 45, 46, 49, 55, 56, 64, 66, 70, 78, 80, 81, 82, 90, 93, 94, 107, 108, 114, 119, 126, 127, 130, 132, 133, 135, 136, 139, 148, 150, 154, 156, 160, 166, 170, 171, 172, 175, 182, 183, 186, 187, 190, 195, 198, 201, 205, 207, 212, 213, 219, 221, 225, 226, 229, 238, 240, 241, 242, 244, 247, 249, 250, 251, 252, 253, 255}\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "x6Prr-jEAG_B",
        "colab_type": "text"
      },
      "source": [
        "You can see that the values range from 0 (min) to 255(max)."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "P6ZGg4-_ANBC",
        "colab_type": "text"
      },
      "source": [
        "It would be better to normalize these values to bring it within a small/finite range, so that the errors don't blow up."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "6QwvQ3a4Acra",
        "colab_type": "text"
      },
      "source": [
        "##Data Normalization."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "msnVijL5__OH",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "x_mean = np.mean(x_train_reshaped)\n",
        "\n",
        "x_std = np.std(x_train_reshaped)"
      ],
      "execution_count": 30,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "VoESF-pyAueb",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "epsilon = 1e-10\n",
        "\n",
        "x_train_norm = (x_train_reshaped - x_mean)/(x_std + epsilon)\n",
        "\n",
        "#why epsilon? Because sometimes a very small value of std can cause error to blow up. \n",
        "#adding another small value just solves that issue."
      ],
      "execution_count": 31,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "c22ArR0MBXHq",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "#Now to avoid leakage, and to make sure whatever we do on train is what we do on test\n",
        "#to be purely unbiased, we use same mean and std to normalize test set.\n",
        "\n",
        "x_test_norm = (x_test_reshaped - x_mean)/(x_std + epsilon)"
      ],
      "execution_count": 32,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "0xLJ_PidB7qZ",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 55
        },
        "outputId": "5dfdaec4-11c4-4bcb-e179-052ea0d2477c"
      },
      "source": [
        "print(set(x_train_norm[0]))"
      ],
      "execution_count": 35,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "{-0.38589016215482896, 1.306921966983251, 1.17964285952926, 1.803310486053816, 1.6887592893452241, 2.8215433456857437, 2.719720059722551, 1.1923707702746593, 1.7396709323268205, 2.057868700961798, 2.3633385588513764, 2.096052433197995, 1.7651267538176187, 2.7960875241949457, 2.7451758812133495, 2.45243393406917, 0.02140298169794222, -0.22042732246464067, 1.2305545025108566, 0.2759611966059242, 2.210603629906587, 2.6560805059955555, 2.6051688630139593, -0.4240738943910262, 0.4668798577869107, 0.1486820891519332, 0.3905123933145161, 1.0905474843114664, -0.09314821501064967, 1.4851127174188385, 2.7579037919587486, 1.5360243604004349, 0.07231462467953861, -0.13133194724684696, 1.294194056237852, 0.03413089244334132, 1.3451056992194483, 2.274243183633583, -0.24588314395543887, 0.772349715676489, 0.75962180493109, 0.7214380726948927, 0.1995937321335296, -0.41134598364562713, 0.5687031437501034, 0.5941589652409017, 0.9378125553666773, 0.9505404661120763, 0.6068868759863008, 0.4159682148053143, -0.042236572029053274, 2.7706317027041476, 2.1342361654341926, 0.12322626766113501, -0.08042030426525057, 0.16140999989733232, 1.8924058612716097, 1.2560103240016547, 2.185147808415789, 0.6196147867316999, 1.943317504253206, -0.11860403650144787, -0.30952269768243434, 1.9942291472348024, -0.2840668761916362, 2.6306246845047574, 2.286971094378982, -0.19497150097384247, -0.39861807290022805, 0.2886891073513233, 1.7523988430722195, 2.3887943803421745, 2.681536327486354, 1.4596568959280403, 2.439706023323771, 2.7833596134495466, 2.490617666305367, -0.10587612575604877, 1.5614801818912332, 1.9051337720170087, 1.6123918248728295, 1.268738234747054, 1.9560454149986053, 2.6433525952501564, 1.026907930584471}\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "wA8hJ4eACHYv",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "###############################################################################################"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ZxzUSjtOCmgP",
        "colab_type": "text"
      },
      "source": [
        "#Task 6: Creating the model"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "iQ1jODLyCpDs",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "from tensorflow.keras.models import Sequential\n",
        "from tensorflow.keras.layers import Dense\n",
        "\n",
        "model = Sequential([\n",
        "                    Dense(128, activation='relu', input_shape=(784,)),\n",
        "                    Dense(128, activation='relu'),\n",
        "                    Dense(10, activation='softmax')\n",
        "                  ])"
      ],
      "execution_count": 36,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Ykth_rIMGKFs",
        "colab_type": "text"
      },
      "source": [
        "##Compile the Model\n",
        "\n",
        "This is where the optimization algorithm is set."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "ymKajhavGFfF",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 272
        },
        "outputId": "e96eea84-f2ea-41d8-b4b8-8109511ccb49"
      },
      "source": [
        "model.compile(\n",
        "    optimizer = 'sgd',\n",
        "    loss='categorical_crossentropy',\n",
        "    metrics=['accuracy']\n",
        ")\n",
        "\n",
        "\n",
        "model.summary()"
      ],
      "execution_count": 37,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "Model: \"sequential\"\n",
            "_________________________________________________________________\n",
            "Layer (type)                 Output Shape              Param #   \n",
            "=================================================================\n",
            "dense (Dense)                (None, 128)               100480    \n",
            "_________________________________________________________________\n",
            "dense_1 (Dense)              (None, 128)               16512     \n",
            "_________________________________________________________________\n",
            "dense_2 (Dense)              (None, 10)                1290      \n",
            "=================================================================\n",
            "Total params: 118,282\n",
            "Trainable params: 118,282\n",
            "Non-trainable params: 0\n",
            "_________________________________________________________________\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "PHDU2BwGHWPJ",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "##################################################"
      ],
      "execution_count": 38,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ugBnV3CuHpWy",
        "colab_type": "text"
      },
      "source": [
        "#Task 7: Training the model."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "BQ_fqPXrIwm0",
        "colab_type": "text"
      },
      "source": [
        "##Train."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "HpIWGKnMHoi-",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 217
        },
        "outputId": "c41bfeac-d226-46f2-d30e-8a4c7a610b63"
      },
      "source": [
        "model.fit(x_train_norm,y_train_encoded,epochs=5)"
      ],
      "execution_count": 39,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "Epoch 1/5\n",
            "1875/1875 [==============================] - 4s 2ms/step - loss: 0.3728 - accuracy: 0.8909\n",
            "Epoch 2/5\n",
            "1875/1875 [==============================] - 4s 2ms/step - loss: 0.1832 - accuracy: 0.9465\n",
            "Epoch 3/5\n",
            "1875/1875 [==============================] - 4s 2ms/step - loss: 0.1389 - accuracy: 0.9595\n",
            "Epoch 4/5\n",
            "1875/1875 [==============================] - 4s 2ms/step - loss: 0.1134 - accuracy: 0.9674\n",
            "Epoch 5/5\n",
            "1875/1875 [==============================] - 5s 3ms/step - loss: 0.0952 - accuracy: 0.9722\n"
          ],
          "name": "stdout"
        },
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "<tensorflow.python.keras.callbacks.History at 0x7f4f35311080>"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 39
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "1iPJ3XnfIPdr",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "#Can see that the model accuracey is increasing epoch by epoch. \n",
        "#epoch can be thought of a one full-fledged pass throughout all the examples."
      ],
      "execution_count": 40,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "hy7dbwDYIvFU",
        "colab_type": "text"
      },
      "source": [
        "##Evaluate."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "X_8YnAVBIg0c",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 54
        },
        "outputId": "07c63fa1-5fe7-4ece-d203-bd46fe9bf51e"
      },
      "source": [
        "loss, accuracy = model.evaluate(x_test_norm,y_test_encoded)\n",
        "\n",
        "#Note that this step doesn't train the model on the test data. \n",
        "#It uses the trained state of our model as in step 39 above (after 5 epochs)\n",
        "#And checks how well the model performs on this new data. \n",
        "#Just a Forward Pass. No bakward pass. No learning. \n",
        "\n",
        "print(f'Model Accuracy is :{accuracy*100}%')"
      ],
      "execution_count": 41,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "313/313 [==============================] - 0s 1ms/step - loss: 0.1049 - accuracy: 0.9683\n",
            "Model Accuracy is :96.82999849319458%\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "MbEoTRvvJsOO",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "#So we can comfortably say that the model has understood how to identify numbers."
      ],
      "execution_count": 42,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "moISPxeIJyiT",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "#####################################################"
      ],
      "execution_count": 43,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "7wJhRtOoKDFa",
        "colab_type": "text"
      },
      "source": [
        "#Task 8: Model Predictions."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "B0Mx0iSEKBlu",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 35
        },
        "outputId": "765509fd-d2fa-45e7-a1bd-8007638b799b"
      },
      "source": [
        "y_p = model.predict(x_test_norm)\n",
        "\n",
        "y_p.shape"
      ],
      "execution_count": 44,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "(10000, 10)"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 44
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "_xC2wQ-6Kb46",
        "colab_type": "text"
      },
      "source": [
        "##Plotting the results."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "YrDDS1jJKXoj",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "##TO DO.\n",
        "##Brilliant idea.\n",
        "###Part 8. Predictions."
      ],
      "execution_count": 46,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "M-O7V7_QMhTR",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        ""
      ],
      "execution_count": null,
      "outputs": []
    }
  ]
}